Knowledge discovery from sequential data
Author
Abstract
A new framework for analyzing sequential or temporal data such as time series is proposed. It differs from other approaches in its special emphasis on the interpretability of the results, since interpretability is of vital importance for knowledge discovery, that is, the development of new knowledge (in the head of a human) from a list of discovered patterns. While traditional approaches try to model and predict all time series observations, the focus in this work is on modelling local dependencies in multivariate time series. This makes it possible to deal with irregular or chaotic series. The proposed discovery process consists of (1) time series abstraction to obtain a representation close to the human perception of time series, (2) the enumeration and ranking of qualitative relationships in the data, and (3) the specialization with quantitative constraints and the generalization of patterns to overcome limitations that are implicitly induced by the search bias.
Summary (German abstract, translated): This work presents an approach to the analysis of sequential or temporal data (such as time series). It differs from other approaches in its particular attention to the interpretability of the results, because this is decisive for knowledge discovery, that is, the development of new knowledge (in the head of a human) from a list of discovered patterns. While traditional approaches try to find a global underlying model for the entire time series, the emphasis here is on modelling local relationships. This also makes it possible to study irregular or chaotic systems.
The process consists of (1) the abstraction of the time series, to better match the way humans perceive them, (2) the enumeration and evaluation of qualitative relationships in the data, and (3) the specialization with quantitative constraints and the generalization of patterns to overcome restrictions that are implicitly imposed by the definition of the search space.
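Step (1), time series abstraction, can be illustrated with a minimal sketch. The abstract does not specify the abstraction method used in the work, so everything here is an assumption made for illustration: the function name `abstract_series`, the threshold `eps`, and the three labels "increasing", "constant" and "decreasing" are hypothetical. The sketch labels each step of a numeric series by its trend and merges consecutive steps with the same label into intervals.

```python
from typing import List, Tuple

# (label, start index, end index), with the end index inclusive.
Interval = Tuple[str, int, int]


def abstract_series(values: List[float], eps: float = 0.5) -> List[Interval]:
    """Turn a numeric series into labeled trend intervals.

    Illustrative only: the labels and the threshold `eps` are assumptions,
    not the abstraction used in the original work.
    """
    intervals: List[Interval] = []
    for i in range(1, len(values)):
        delta = values[i] - values[i - 1]
        if delta > eps:
            label = "increasing"
        elif delta < -eps:
            label = "decreasing"
        else:
            label = "constant"
        if intervals and intervals[-1][0] == label:
            # Same trend as the previous step: extend the current interval.
            intervals[-1] = (label, intervals[-1][1], i)
        else:
            # Trend changed: open a new interval covering this step.
            intervals.append((label, i - 1, i))
    return intervals


series = [1.0, 2.0, 3.5, 3.6, 3.4, 2.0, 0.5]
print(abstract_series(series))
# [('increasing', 0, 2), ('constant', 2, 4), ('decreasing', 4, 6)]
```

The resulting labeled intervals are the kind of symbolic representation over which qualitative relationships, as in step (2), could then be enumerated.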
Similar resources
Balanced Constraint Measure Algorithm to Preserve Privacy from Sequential Rule Discovery
Preservation of needed privacy from mining algorithms (data mining methods which extract information from the privacy diffusion of people and organizations) is an emerging research area. Researchers are creating procedures to strike a proper balance between information privacy and knowledge discovery through data mining. In this paper, we initially use the PrefixSpan algorithm to...
Fuzzy multiple-level sequential patterns discovery from customer transaction database
Sequential pattern discovery is a very important research topic in data mining and knowledge discovery and has been widely applied in business analysis. Previous work focused on mining sequential patterns at a single concept level based on definite and accurate concepts, which may not be concise and meaningful enough for human experts to easily obtain nontrivial knowledge from the rules dis...
Data Mining in Sequential Pattern for Asynchronous Periodic Patterns
Data mining is becoming an increasingly important tool for transforming enormous amounts of data into useful information. Mining periodic patterns in temporal datasets plays an important role in data mining and knowledge discovery tasks. This paper presents the design and development of software for sequential pattern mining of asynchronous periodic patterns in temporal databases. Comparative study of various alg...
A Rough Sets Partitioning Model for Mining Sequential Patterns with Time Constraint
Nowadays, data mining and knowledge discovery methods are applied to a variety of enterprise and engineering disciplines to uncover interesting patterns from databases. The study of sequential patterns is an important data mining problem due to its wide application to real-world time-dependent databases. Sequential patterns are inter-event patterns ordered over a time-period associated with ...
Knowledge discovery from patients’ behavior via clustering-classification algorithms based on weighted eRFM and CLV model: An empirical study in public health care services
The rapid growth of information technology (IT) motivates and creates competitive advantages in the health care industry. Nowadays, many hospitals try to build successful customer relationship management (CRM) to recognize target and potential patients, increase patient loyalty and satisfaction, and finally maximize their profitability. Many hospitals have large data warehouses containing customer ...